Dimension reduction for high-dimensional data.

Author

  • Lexin Li
Abstract

With the advance of modern technologies, high-dimensional data have become prevalent in computational biology. The number of variables p is very large, and in many applications p exceeds the number of observational units n. Such high dimensionality and the unconventional small-n-large-p setting pose new challenges to statistical analysis methods. Dimension reduction, which aims to reduce the predictor dimension prior to any modeling effort, offers a potentially useful avenue for tackling such high-dimensional regression. In this chapter, we review a number of commonly used dimension reduction approaches, including principal component analysis, partial least squares, and sliced inverse regression. For each method, we review its background and its applications in computational biology, discuss both its advantages and limitations, and offer enough operational details for implementation. A numerical example analyzing a microarray survival data set is given to illustrate applications of the reviewed reduction methods.
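As a rough illustration of the first method named in the abstract, the sketch below applies principal component analysis in a small-n-large-p setting by taking a thin singular value decomposition of the centered data matrix. It is not taken from the chapter: the function name `pca_reduce`, the simulated data, and the choices n = 20, p = 500, k = 3 are assumptions made purely for illustration.

```python
import numpy as np

def pca_reduce(X, k):
    """Project an n x p data matrix onto its first k principal components.

    Hypothetical helper for illustration; not the chapter's own code.
    """
    Xc = X - X.mean(axis=0)              # center each predictor (column)
    # Thin SVD: Xc = U @ diag(s) @ Vt, with at most min(n, p) components,
    # so the p x p covariance matrix is never formed explicitly.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]            # n x k reduced predictors
    loadings = Vt[:k].T                  # p x k principal directions
    explained = s[:k] ** 2 / np.sum(s ** 2)  # share of variance per component
    return scores, loadings, explained

# Simulated data with far more variables than observations (assumed sizes).
rng = np.random.default_rng(0)
n, p, k = 20, 500, 3
X = rng.normal(size=(n, p))
scores, loadings, explained = pca_reduce(X, k)
```

The reduced matrix `scores` has only k columns and could then be passed to a conventional regression model. Note that PCA ignores the response when choosing directions, which is one reason response-aware methods such as partial least squares and sliced inverse regression are reviewed alongside it.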

Similar resources

Learning gradients on manifolds

A common belief in high dimensional data analysis is that data is concentrated on a low dimensional manifold. This motivates simultaneous dimension reduction and regression on manifolds. We provide an algorithm for learning gradients on manifolds for dimension reduction for high dimensional data with few observations. We obtain generalization error bounds for the gradient estimates and show tha...

Improving the nonlinear manifold separator model for face recognition with a single image per person

Manifold learning is a dimension reduction method for extracting nonlinear structures of high-dimensional data. Many methods have been introduced for this purpose. Most of these methods usually extract a global manifold for data. However, in many real-world problems, there is not only one global manifold, but also additional information about the objects is shared by a large number of manifolds...

Visual Hierarchical Dimension Reduction for Exploration of High Dimensional Datasets

Traditional visualization techniques for multidimensional data sets, such as parallel coordinates, glyphs, and scatterplot matrices, do not scale well to high numbers of dimensions. A common approach to solving this problem is dimensionality reduction. Existing dimensionality reduction techniques usually generate lower dimensional spaces that have little intuitive meaning to users and allow litt...

On Comparing the Effects of Different Transformations for Visualization of High-Dimensional Data

This paper presents a visualization tool, VisHD, which can visualize the spatial distribution of vector points in a high dimensional feature space. Handling high dimensional information is important in many areas of computer science. VisHD provides several methods for dimension reduction in order to map the data from a high dimensional space to a low dimensional one. Next, this system builds intu...

Comparative Study of Dimension Reduction Approaches With Respect to Visualization in 3-Dimensional Space

In the present big data era, there is a need to process large amounts of unlabeled data and find patterns in the data for further use. If data has many dimensions, it is very hard to gain any insight into it. It is possible to convert high-dimensional data to low-dimensional data using different techniques; this dimension reduction is important and makes tasks such as classification, visual...

Analysis of linear and nonlinear dimensionality reduction methods for gender classification of face images

Data in many real world applications are high dimensional and learning algorithms like neural networks may have problems in handling high dimensional data. However, the Intrinsic Dimension is often much less than the original dimension of the data. Here, we use fractal based methods to estimate the Intrinsic Dimension and show that a nonlinear projection method called Curvilinear Component Anal...

Journal title:
  • Methods in molecular biology

Volume 620  Issue 

Pages  -

Publication date 2010